fix: updated mlx related pkgs + end condition for stopping model request loop by madclaws · Pull Request #111 · tilesprivacy/tiles

madclaws · 2026-03-28T12:20:06Z

No description provided.

…est loop

coderabbitai · 2026-03-28T12:20:55Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: ec12eb76-d604-41b7-b301-fae97a935e7d

📥 Commits

Reviewing files that changed from the base of the PR and between 8f68bfd and c4f58ec.

⛔ Files ignored due to path filters (1)

server/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (14)

modelfiles/qwen
pkg/scripts/preinstall
server/api.py
server/backend/mlx.py
server/hf_downloader.py
server/pyproject.toml
server/stack/requirements/app-server/packages-app-server.txt
server/stack/requirements/app-server/pylock.app-server.meta.json
server/stack/requirements/app-server/pylock.app-server.toml
server/stack/requirements/app-server/requirements-app-server.in
server/stack/venvstacks.toml
server/throttled_download_worker.py
tiles/src/main.rs
tiles/src/runtime/mlx.rs

📝 Walkthrough

Walkthrough

This PR updates many pinned Python dependencies and lockfiles (notably bumping mlx, mlx-lm, transformers, replacing requests/urllib3 with httpcore/httpx, and adding several tokenizer/CLI packages), adds a preinstall script, and changes container/model files (qwen base image). It removes the explicit model download flow: deletes server/hf_downloader.py, server/throttled_download_worker.py, the /download POST endpoint, and a download helper in server/backend/mlx.py; model provisioning is now performed implicitly via existing model-loading calls. A small runtime change adjusts reply extraction behavior in the MLX Rust runtime and replaces logger initialization in main.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant ServerAPI as Server API
participant Backend as MLX Backend
participant ModelCache as Model Cache
participant HF as HuggingFace (external)
Client->>ServerAPI: request model download (old flow)
ServerAPI->>Backend: pull_model request
Backend->>HF: snapshot_download (throttled worker)
HF-->>Backend: model artifacts
Backend->>ModelCache: write model files
Backend-->>ServerAPI: success/failure
ServerAPI-->>Client: response

mermaid
sequenceDiagram
participant Client
participant ServerAPI as Server API
participant Backend as MLX Backend
participant ModelCache as Model Cache
Client->>ServerAPI: request inference/start
ServerAPI->>Backend: get_or_load_model(model_spec)
Backend->>ModelCache: check cache
alt cached
ModelCache-->>Backend: model path
else not cached
Backend->>ModelCache: attempt implicit load (load_model)
ModelCache-->>Backend: model path / error
end
Backend-->>ServerAPI: model ready
ServerAPI-->>Client: inference response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Bundle models for the offline installer #101: Changes model provisioning to use model-cache-aware loading and adds get_or_load_model/get_model_cache paths — closely related to removal of explicit downloader and implicit loading here.
v0.3.0 + Linux compatibility #30: Modifies server/backend and API model-loading code; overlaps with this PR's removal of download helpers and endpoint.
pkg enhancements #97: Alters model provisioning flow and Rust helpers for model-cache/download; directly related to runtime/cache behavior changed in this PR.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	No pull request description was provided by the author, making it impossible to assess whether the description is related to the changeset.	Add a pull request description that explains the purpose of the MLX package updates and the end condition logic change in tiles/src/runtime/mlx.rs.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the two main changes: MLX package updates and an addition to handle the end condition for stopping model request loops in the Rust code.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/qwen3.5-load-issue

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/stack/requirements/app-server/pylock.app-server.toml`:
- Around line 233-245: The call to snapshot_download is receiving the removed
parameter "local_dir_use_symlinks" (it's set to False at the earlier assignment
and ends up in kwargs_dict passed to snapshot_download), which will raise a
TypeError with huggingface-hub 1.8.0; remove that key from kwargs_dict before
calling snapshot_download (e.g., pop/remove "local_dir_use_symlinks" from
kwargs_dict in the fallback/throttled-worker download path so the
snapshot_download call no longer receives the unexpected keyword).

In `@server/stack/venvstacks.toml`:
- Line 29: The exception handlers around snapshot_download in
server/hf_downloader.py (the except block catching
requests.exceptions.HTTPError/ConnectionError/Timeout at lines ~120-127) must be
updated to handle the newer huggingface-hub v1.8.0 behavior: replace or extend
the caught exceptions to include httpx.HTTPError and the Hugging Face
hub-specific HTTP error (HfHubHttpError / HfHubHTTPError) and add the
corresponding imports, while keeping any existing requests-based catches as a
fallback so both old and new backends are supported; update the except clause(s)
to catch (httpx.HTTPError, HfHubHttpError) alongside the existing
requests.exceptions.*.

In `@tiles/src/runtime/mlx.rs`:
- Line 553: Remove the unconditional debug print that uses println! of the
variable content (the line containing println!("{}", content)); either delete
that line or replace it with a conditional debug log (e.g., tracing::debug! or
log::debug!) behind a feature flag so the full content is not printed to stdout
in production. Locate the println! call in tiles/src/runtime/mlx.rs where
content is printed and switch to a non-stdout, conditional logging mechanism or
remove it entirely.
- Around line 557-559: The sentinel "END" return in extract_reply (used in
mlx.rs) is causing downstream logic to treat it as a real assistant message
(affecting g_reply emptiness checks and causing "END" to be stored in
conversations/DB); change extract_reply to return Option<String> (None for the
loop-termination case) or otherwise stop returning the literal "END", and update
convert_to_chat_response and all callers (the code that checks g_reply, the
conversation push logic around conversations and DB persistence) to explicitly
handle None as "terminate/skip storing" while preserving empty-string replies as
valid responses so the existing g_reply.is_empty() check still works. Ensure
references: extract_reply, convert_to_chat_response, g_reply, conversations, and
the DB persist path are updated to treat the termination signal separately from
a real reply.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9716b2ef-9554-478b-869d-7f91a9ee317a

📥 Commits

Reviewing files that changed from the base of the PR and between d874af2 and 8f68bfd.

⛔ Files ignored due to path filters (1)

server/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (11)

server/pyproject.toml
server/stack/requirements/app-server/packages-app-server.txt
server/stack/requirements/app-server/pylock.app-server.meta.json
server/stack/requirements/app-server/pylock.app-server.toml
server/stack/requirements/app-server/requirements-app-server.in
server/stack/requirements/framework-mlx/packages-framework-mlx.txt
server/stack/requirements/framework-mlx/pylock.framework-mlx.meta.json
server/stack/requirements/framework-mlx/pylock.framework-mlx.toml
server/stack/requirements/framework-mlx/requirements-framework-mlx.in
server/stack/venvstacks.toml
tiles/src/runtime/mlx.rs

coderabbitai · 2026-03-28T12:27:53Z

+httpcore==1.0.9
+httpx==0.28.1
+huggingface-hub==1.8.0


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n -C2 'import requests|requests\.exceptions|requests\.(get|post)\s*=|snapshot_download\s*\(|HfHubHttpError|set_(async_)?client_factory' server/hf_downloader.py server/throttled_download_worker.py

Repository: tilesprivacy/tiles

Length of output: 2917

🏁 Script executed:

cat -n server/stack/requirements/app-server/packages-app-server.txt

Repository: tilesprivacy/tiles

Length of output: 1515

🏁 Script executed:

# Check if requests appears anywhere in the requirements file rg -n "requests" server/stack/requirements/app-server/packages-app-server.txt

Repository: tilesprivacy/tiles

Length of output: 44

🌐 Web query:

huggingface-hub 1.8.0 dependencies requests httpx

💡 Result:

huggingface-hub 1.8.0 requires httpx as a core dependency (>=0.23.0, <1), but does not require requests. Since v1.0, the library has migrated from requests to httpx as its HTTP backend, dropping requests as a dependency. httpx is listed under core dependencies on PyPI, while requests is absent. It also appears as an optional dependency in some extras (e.g., oauth), but not required for base installation.

Citations:

1: https://pypi.org/project/huggingface-hub/

2: https://github.com/huggingface/huggingface_hub

3: https://archlinux.org/packages/extra/any/python-huggingface-hub/

4: https://huggingface.co/docs/huggingface_hub/en/concepts/migration

Update downloader code to use huggingface-hub v1's httpx-based HTTP backend instead of relying on requests.

requests is no longer in the requirements, but server/throttled_download_worker.py imports it (line 29), monkeypatches requests.get/post (lines 99–100), and catches requests.exceptions.* (lines 117, 132). Similarly, server/hf_downloader.py imports requests in its fallback path (line 116) and catches requests.exceptions.* (lines 122, 125, 128). Hub v1.8.0 migrated from requests to httpx and no longer includes requests as a dependency. The current monkeypatching approach will fail, and the custom throttling will not affect Hub traffic. Use set_client_factory() or set_async_client_factory() to hook into Hub's HTTP client instead.

codecov · 2026-03-28T12:31:58Z

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
tiles/src/main.rs	0.00%	3 Missing ⚠️
tiles/src/runtime/mlx.rs	0.00%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

- pre-install script to remove prev server folder by the installer - removed iroh error logs showing in release build - removed HF deps as downloader is done via RUST

fix: updated mlx related pkgs + end condition for stopping model requ…

8f68bfd

…est loop

madclaws linked an issue Mar 28, 2026 that may be closed by this pull request

Bug: Qwen 3.5 models not running #103

Closed

coderabbitai Bot reviewed Mar 28, 2026

View reviewed changes

fix: added pre-install script + log cleanup + removed HF deps

c4f58ec

- pre-install script to remove prev server folder by the installer - removed iroh error logs showing in release build - removed HF deps as downloader is done via RUST

madclaws merged commit fdc886c into main Mar 29, 2026
2 of 3 checks passed

madclaws deleted the fix/qwen3.5-load-issue branch March 29, 2026 20:51

coderabbitai Bot mentioned this pull request Apr 5, 2026

server: OpenAI API and llama.cpp Web UI compatibility #121

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: updated mlx related pkgs + end condition for stopping model request loop#111

fix: updated mlx related pkgs + end condition for stopping model request loop#111
madclaws merged 2 commits intomainfrom
fix/qwen3.5-load-issue

madclaws commented Mar 28, 2026

Uh oh!

coderabbitai Bot commented Mar 28, 2026 •

edited

Loading

Review failed

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 28, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 28, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

madclaws commented Mar 28, 2026

Uh oh!

coderabbitai Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Mar 28, 2026 •

edited

Loading

codecov Bot commented Mar 28, 2026 •

edited

Loading